ABSTRACT

Blum's Medial Axis Transformation (MAT) of the set S of
1's in a binary picture can be defined by an iterative shrinking
and reexpanding process which detects "corners" on the
contours of constant distance from S, and thereby yields a
"skeleton" of S. For unsegmented (gray level) pictures, one
can use an analogous definition, in which local MIN and MAX
operations play the roles of shrinking and expanding, to compute
a "MMMAT value" at each point of the picture. The set of
points having high values defines a good "skeleton" for the set
of high-gray-level points in the given picture.
1. Introduction


Let S be a subset of a picture, let P be a point of S,
and let D(P) be the largest "disk" (or neighborhood of some
specified shape) centered at P that is contained in S. We
call D(P) a maximal disk of S if it is not contained in D(Q)
for any Q ≠ P. Evidently, S is the union of its maximal disks.

The  "medial  axis  transform"  (MAT)  [1]  of  S  consists  of  the 
centers  of  these  disks  together  with  their  radii.  In  digital 
pictures  "disks"  are  usually  approximated  by  squares,  whose 
orientation  depends  on  the  definition  of  distance  in  the  grid. 

When the "chessboard" distance [d((a,b),(c,d)) = max(|a-c|,|b-d|)]
is used, the "disks" are upright squares. When the "city block"
distance [d((a,b),(c,d)) = |a-c|+|b-d|] is used, the "disks"
are an approximation of diagonal squares.
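The two metrics and the "disks" they induce can be sketched in a few
lines. This is only an illustration: the function names and the
(row, column) tuple representation of grid points are assumptions made
here, not part of the original text.

```python
def chessboard(p, q):
    """Chessboard distance: the larger of the two coordinate differences."""
    (a, b), (c, d) = p, q
    return max(abs(a - c), abs(b - d))


def city_block(p, q):
    """City block distance: the sum of the two coordinate differences."""
    (a, b), (c, d) = p, q
    return abs(a - c) + abs(b - d)


def disk(center, radius, dist):
    """All grid points within `radius` of `center` under the metric `dist`."""
    a, b = center
    return {(x, y)
            for x in range(a - radius, a + radius + 1)
            for y in range(b - radius, b + radius + 1)
            if dist((x, y), center) <= radius}


def show(points, radius):
    """Render a disk centered at the origin as rows of '#' and '.'."""
    span = range(-radius, radius + 1)
    return ["".join("#" if (x, y) in points else "." for y in span)
            for x in span]
```

Calling `show(disk((0, 0), 2, chessboard), 2)` gives a filled upright
square, while the city block version gives a diamond, i.e. the digital
approximation of a diagonal square.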

An equivalent definition of MAT uses paths from a point to
the boundary. The distance of a point x in S from $\bar{S}$
(the complement of S) is the length of a shortest path from x
to $\bar{S}$. The MAT can then be defined as the set of all
points in S which do not belong to the minimal path of any
other point, together with their distances. It has been shown [2]
that for digital pictures using discrete distance metrics the
points in the MAT are those points whose distances from $\bar{S}$
are local maxima. The MAT can be regarded as a generalized axis of
symmetry of S, and constitutes a kind of "skeleton".
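The local-maximum characterization can be illustrated with a minimal
sketch. It assumes that S is a finite set of (row, column) pairs on an
unbounded grid, that a point of S adjacent to the complement has
distance 1, and that "local maximum" means "no neighbor has a greater
distance"; the function names are hypothetical.

```python
from collections import deque

N4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]            # city block neighbors
N8 = N4 + [(-1, -1), (-1, 1), (1, -1), (1, 1)]     # chessboard neighbors


def distance_to_complement(S, nbrs):
    """Breadth-first distance from each point of S to the nearest non-S point."""
    dist = {}
    frontier = deque()
    for (r, c) in S:
        # Points of S that touch the complement start the wavefront.
        if any((r + dr, c + dc) not in S for dr, dc in nbrs):
            dist[(r, c)] = 1
            frontier.append((r, c))
    while frontier:
        r, c = frontier.popleft()
        for dr, dc in nbrs:
            q = (r + dr, c + dc)
            if q in S and q not in dist:
                dist[q] = dist[(r, c)] + 1
                frontier.append(q)
    return dist


def mat_points(S, nbrs):
    """Points of S whose distance is not exceeded by any neighbor's distance."""
    dist = distance_to_complement(S, nbrs)
    return {p: d for p, d in dist.items()
            if all(dist.get((p[0] + dr, p[1] + dc), 0) <= d
                   for dr, dc in nbrs)}
```

For a 3-by-7 rectangle with eight-neighbor (chessboard) distance, the
result is the five points along the middle row, each with distance 2.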


Several generalizations of the MAT have been proposed,
based on these definitions, which allow a MAT to be defined
for a gray level digital picture, rather than for a two-valued
picture representing a set S (1's at points of S, 0's elsewhere).
One generalization [3], the SPAN (Spatial Piecewise Approximation
by Neighborhoods), is defined in terms of maximal homogeneous
disks; the given picture can be approximated if we are
given the set of centers, radii, and average gray levels of
these disks [5]. If the picture is two-valued, and "homogeneous"
means "constant-valued", the SPAN reduces to the MAT. Another
generalization, the GRAYMAT [4], is based on the concept of
gray-weighted distance: the gray-weighted length of a path is
proportional to the sum (or integral) of the gray levels along
the path; the gray-weighted distance between two points is the
lowest gray-weighted length of any path between them. The
GRAYMAT is the set of all points which do not belong to any
minimal gray-weighted path from any other point to the
zero-valued background, together with the corresponding distance.
This too reduces to the MAT in the two-valued case. Still
another generalization, the GRADMAT [5], computes a score for
each point P of a picture based on the gradient magnitudes at
all pairs of points that have P as their midpoint; thus these
scores are high at points that lie midway between pairs of
antiparallel edges, so that they define a weighted "medial axis".


Each of these generalizations has disadvantages. The SPAN
is costly to compute, since it involves testing neighborhoods
of all sizes at every point for homogeneity. The GRAYMAT is
defined relative to the set of 0's in the picture, so that it
requires the picture to be segmented into 0's ("background")
and non-0's ("objects"). The GRADMAT turns out to be rather
sensitive to noise and to irregularities in region edges.

This paper proposes a new grayscale generalization of the
MAT which is inexpensive to compute, does not require the
picture to be segmented, and is insensitive to noise. Its
definition is based on the fact that the MAT of a set S can be
constructed by a process of iteratively shrinking and reexpanding
S [6]. For grayscale pictures, the operations of local MIN and
local MAX are generalizations of shrinking and expanding,
respectively [7]. Thus if we use iterated local MIN and MAX
instead of shrinking and expanding, we obtain a "MAT"
construction that is applicable to grayscale pictures. The
resulting "MAT" will be called the MMMAT (short for "min-max MAT").

Section 2 reviews the shrink/expand construction of the MAT
and defines its min/max generalization. Section 3 shows that this
MMMAT construction yields reasonable "medial axes" in a variety
of cases.


2. The MMMAT


The MAT can also be defined by a propagation process starting
at the contour of the figure, and propagating toward the inside
of the figure. The contour is the initial wavefront of the
propagation process, and the propagation velocity is fixed.
Wavefront superposition is not allowed, and wavefront intersection
points are the points of the MAT. The gray level extension
of this "grass fire" definition was given in [6], where the
propagation velocity is inversely proportional to the gray level.

Following the above definition, the propagation of a wavefront
in a binary digital picture can be modelled by a sequence
of "shrink" operations, and the MAT can be constructed by a
simple process of iterated shrinking and reexpanding using the
appropriate neighborhood (4-neighborhood for the city block
distance, 8-neighborhood for the chessboard distance). Let
$S^{(k)}$ denote the result of "expanding" S k times, where a single
expansion step ($S^{(1)}$) means that all points of $\bar{S}$ (the
complement of S) which are neighbors of points in S are adjoined
to S. Similarly, let $S^{(-k)}$ be the result of "shrinking" S k
times; a single shrinking step means that all points of S which
are neighbors of points in $\bar{S}$ are deleted from S. Shrinking
S is evidently equivalent to expanding $\bar{S}$, and vice versa.
A point is in $S^{(-k)}$ iff its distance from $\bar{S}$ is at
least k; here the distance is city block if we use only horizontal
and vertical neighbors in the definition of shrinking, and the
distance is chessboard if we also use diagonal neighbors.
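A single shrinking or expansion step can be sketched directly on point
sets. The sketch assumes that S is a finite set of (row, column) pairs
on an unbounded grid, so that no picture border is involved; the names
`shrink`, `expand`, and `iterate` are illustrative only.

```python
N4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]            # horizontal and vertical
N8 = N4 + [(-1, -1), (-1, 1), (1, -1), (1, 1)]     # plus diagonal neighbors


def expand(S, nbrs):
    """Adjoin to S every point that is a neighbor of some point of S."""
    return set(S) | {(r + dr, c + dc) for (r, c) in S for dr, dc in nbrs}


def shrink(S, nbrs):
    """Delete from S every point that has a neighbor outside S."""
    return {(r, c) for (r, c) in S
            if all((r + dr, c + dc) in S for dr, dc in nbrs)}


def iterate(step, S, k, nbrs):
    """Apply `step` (shrink or expand) k times."""
    for _ in range(k):
        S = step(S, nbrs)
    return S
```

With a 5-by-5 square and eight neighbors, one shrinking step leaves the
inner 3-by-3 square, two leave the center point, and three leave the
empty set.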


It can be shown that for all nonnegative i and j we have
$(S^{(-i)})^{(j)} \subseteq S^{(j-i)} \subseteq (S^{(j)})^{(-i)}$;
in particular, for all nonnegative k we have
$(S^{(-k)})^{(1)} \subseteq S^{(-k+1)}$. The difference set
$D_k = S^{(-k+1)} - (S^{(-k)})^{(1)}$ consists of points whose
distances from $\bar{S}$ are k-1, and which have no neighbor at
distance k or greater; hence in the discrete case $D_k$ is just
the set of distance maxima at distance k-1 from $\bar{S}$. Thus
$\bigcup_k D_k$ is the set of all distance maxima, i.e., of MAT
points.
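The construction of the difference sets can be sketched as follows.
The sketch assumes a finite point set on an unbounded grid and uses
hypothetical helper names; it returns only the nonempty D_k, indexed
by k.

```python
N4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]
N8 = N4 + [(-1, -1), (-1, 1), (1, -1), (1, 1)]


def expand(S, nbrs):
    """One expansion step: add all neighbors of points of S."""
    return set(S) | {(r + dr, c + dc) for (r, c) in S for dr, dc in nbrs}


def shrink(S, nbrs):
    """One shrinking step: keep points whose neighbors all lie in S."""
    return {(r, c) for (r, c) in S
            if all((r + dr, c + dc) in S for dr, dc in nbrs)}


def difference_sets(S, nbrs):
    """Return {k: D_k} for every nonempty D_k = S^(-k+1) - (S^(-k))^(1)."""
    result = {}
    prev, k = set(S), 1              # prev holds S^(-k+1)
    while prev:                      # a finite set shrinks to nothing
        cur = shrink(prev, nbrs)     # S^(-k)
        d_k = prev - expand(cur, nbrs)
        if d_k:
            result[k] = d_k
        prev, k = cur, k + 1
    return result


def mat(S, nbrs):
    """Union of all difference sets."""
    return set().union(*difference_sets(S, nbrs).values())
```

For a 3-by-7 rectangle with eight neighbors the only nonempty set is
D_2, the five interior points of the middle row. For a 3-by-3 square
with four neighbors, D_1 holds the four corners and D_2 the center,
which together form the diagonal skeleton expected under city block
distance.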

Shrinking S is equivalent to performing a local MIN operation
on the two-valued picture that has 1's at the points of S,
and expanding S is equivalent to performing a local MAX operation
on this picture, where "local" is defined in terms of the
appropriate set of neighbors. For a gray level digital picture I,
let $I^{(k)}$ be the result of applying k iterations of local MAX
to I, and let $I^{(-k)}$ be the result of k iterations of local MIN.
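Local MIN and local MAX can be sketched on a picture stored as a list
of rows. The representation, the function names, and the `outside`
parameter (the gray level assumed beyond the picture border, 0 by
default) are assumptions of this sketch.

```python
N4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]
N8 = N4 + [(-1, -1), (-1, 1), (1, -1), (1, 1)]


def local_op(img, op, nbrs, outside=0):
    """Replace each pixel by op() over itself and its neighbors."""
    rows, cols = len(img), len(img[0])

    def at(r, c):
        inside = 0 <= r < rows and 0 <= c < cols
        return img[r][c] if inside else outside

    return [[op([at(r, c)] + [at(r + dr, c + dc) for dr, dc in nbrs])
             for c in range(cols)]
            for r in range(rows)]


def local_min(img, nbrs=N8, outside=0):
    """Grayscale counterpart of one shrinking step."""
    return local_op(img, min, nbrs, outside)


def local_max(img, nbrs=N8, outside=0):
    """Grayscale counterpart of one expansion step."""
    return local_op(img, max, nbrs, outside)


def iterate(step, img, k, nbrs=N8):
    """Apply local_min or local_max k times."""
    for _ in range(k):
        img = step(img, nbrs)
    return img
```

On a two-valued picture these operations behave exactly like shrinking
and expanding the set of 1's: a 3-by-3 block of 1's shrinks to its
center under `local_min` and is restored by `local_max`.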


It can be shown [7] that for all nonnegative i and j we have
$(I^{(-i)})^{(j)} \le I^{(j-i)} \le (I^{(j)})^{(-i)}$; in particular,
for all nonnegative k we have $(I^{(-k)})^{(1)} \le I^{(-k+1)}$, so
that the difference picture
$\Delta_k = I^{(-k+1)} - (I^{(-k)})^{(1)}$ is everywhere nonnegative
(all picture operations are performed pointwise). If I is
a two-valued picture and S is its set of 1's, then the set of
1's of $\Delta_k$ is just $D_k$.
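The difference pictures can be computed with two passes per iteration.
The sketch below assumes a nonnegative picture stored as a list of
rows, eight-neighbor operations by default, and 0's outside the
picture; all names are illustrative.

```python
N4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]
N8 = N4 + [(-1, -1), (-1, 1), (1, -1), (1, 1)]


def local_op(img, op, nbrs):
    """op() over each pixel and its neighbors; outside pixels count as 0."""
    rows, cols = len(img), len(img[0])

    def at(r, c):
        return img[r][c] if 0 <= r < rows and 0 <= c < cols else 0

    return [[op([at(r, c)] + [at(r + dr, c + dc) for dr, dc in nbrs])
             for c in range(cols)]
            for r in range(rows)]


def delta_pictures(img, m, nbrs=N8):
    """Return [Delta_1, ..., Delta_m], Delta_k = I^(-k+1) - (I^(-k))^(1)."""
    deltas = []
    prev = img                                 # I^(-k+1), initially I^(0)
    for _ in range(m):
        cur = local_op(prev, min, nbrs)        # I^(-k)
        reexp = local_op(cur, max, nbrs)       # (I^(-k))^(1)
        deltas.append([[a - b for a, b in zip(row_a, row_b)]
                       for row_a, row_b in zip(prev, reexp)])
        prev = cur
    return deltas
```

For a picture with a 3-by-3 plateau of gray level 4 and a center peak
of 9, the first difference picture is 5 at the center (the part of the
peak removed by one MIN/MAX pass) and the second is 4 at the center;
every other entry is 0.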

In the two-valued case, when we shrink S, a given point P
of S remains unchanged until $k = d(P,\bar{S})$, and then changes
to 0; but in the general case, when we iterate local MIN, the value
of P may change many times. Let $z_k(P)$ be the lowest gray level
within distance $\le k$ of P; thus $z_0 \ge z_1 \ge \ldots$, where
$z_0$ is P's gray level in $I = I^{(0)}$. Readily, $z_k(P)$ is the
gray level $I^{(-k)}(P)$ of P in $I^{(-k)}$. If $z_k(P) = z_{k-1}(P)$,
$\Delta_k$ must be 0 at P, since the max of $z_k(P)$ and its
neighbors in $I^{(-k)}$ is at least (hence exactly) $z_{k-1}(P)$;
but if $z_k(P) < z_{k-1}(P)$, $\Delta_k$ may be $> 0$ at P.

The MMMAT value of P can be defined in terms of the $\Delta_k(P)$
values (k=1,2,...) in several ways. One possibility is to use
their maximum; another is to use their sum. As we shall see in
the next section, both of these definitions yield MAT-like loci
of high MMMAT values. It is evident that the max definition
yields values in the same range [0,z] as the picture's grayscale,
since $0 \le \Delta_k = I^{(-k+1)} - (I^{(-k)})^{(1)} \le I^{(-k+1)} \le z$
for all k. For the sum definition too, we have
$0 \le \Delta_k \le \sum_k \Delta_k$. On the other hand,
$\sum_k (I^{(-k)} - I^{(-k-1)}) \le z$, because the sum telescopes;
and since $(I^{(-k-1)})^{(1)} \ge I^{(-k-1)}$, this implies
$\sum_k (I^{(-k)} - (I^{(-k-1)})^{(1)}) \le z$.
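Both definitions can be accumulated in a single loop without storing
the individual difference pictures. This is a sketch under stated
assumptions: nonnegative gray levels, a list-of-rows picture, 0's
outside the picture, and a caller-chosen iteration count m; the
function names are hypothetical.

```python
N4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]
N8 = N4 + [(-1, -1), (-1, 1), (1, -1), (1, 1)]


def local_op(img, op, nbrs):
    """op() over each pixel and its neighbors; outside pixels count as 0."""
    rows, cols = len(img), len(img[0])

    def at(r, c):
        return img[r][c] if 0 <= r < rows and 0 <= c < cols else 0

    return [[op([at(r, c)] + [at(r + dr, c + dc) for dr, dc in nbrs])
             for c in range(cols)]
            for r in range(rows)]


def mmmat(img, m, nbrs=N8):
    """Return (max over k of Delta_k, sum over k of Delta_k), k = 1..m."""
    rows, cols = len(img), len(img[0])
    best = [[0] * cols for _ in range(rows)]
    total = [[0] * cols for _ in range(rows)]
    prev = img                                 # I^(-k+1)
    for _ in range(m):
        cur = local_op(prev, min, nbrs)        # I^(-k)
        reexp = local_op(cur, max, nbrs)       # (I^(-k))^(1)
        for r in range(rows):
            for c in range(cols):
                d = prev[r][c] - reexp[r][c]   # Delta_k at (r, c)
                best[r][c] = max(best[r][c], d)
                total[r][c] += d
        prev = cur
    return best, total
```

For a uniform bar of gray level 6 that is three pixels thick, both
versions give 6 along the middle of the bar and 0 elsewhere. For a
picture with a plateau of 4 and a center peak of 9, the max version
gives 5 at the center and the sum version gives 9, which does not
exceed the highest gray level.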

When the local MIN operation is iterated many times, border
effects become a serious problem. In the two-valued case, if
we require that S be interior to the picture, then the border of
the picture consists entirely of 0's, and we can treat the outside
of the picture as consisting of 0's without creating any
artifacts. In the grayscale case, however, whatever value(s)
we use outside the picture will have effects on their neighbors
inside it, and as the process is iterated, these effects propagate,
as we will see in the next section.
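The border effect is easy to reproduce on a constant picture, which has
no structure and so should ideally give an all-zero result. The sketch
below compares two conventions: treating the outside as 0's, and simply
ignoring pixels that fall outside the picture. The second convention is
an assumption introduced here for contrast, not something proposed in
the text, and the names are illustrative.

```python
N4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]
N8 = N4 + [(-1, -1), (-1, 1), (1, -1), (1, 1)]


def local_op(img, op, nbrs, outside):
    """op() over each pixel and its neighbors.

    Pixels beyond the border take the value `outside`; if `outside`
    is None they are left out of the neighborhood altogether.
    """
    rows, cols = len(img), len(img[0])
    out = []
    for r in range(rows):
        row = []
        for c in range(cols):
            vals = [img[r][c]]
            for dr, dc in nbrs:
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    vals.append(img[rr][cc])
                elif outside is not None:
                    vals.append(outside)
            row.append(op(vals))
        out.append(row)
    return out


def mmmat_max(img, m, nbrs, outside):
    """Pointwise maximum of Delta_1..Delta_m under a border convention."""
    best = [[0] * len(img[0]) for _ in img]
    prev = img
    for _ in range(m):
        cur = local_op(prev, min, nbrs, outside)
        reexp = local_op(cur, max, nbrs, outside)
        best = [[max(b, p - e) for b, p, e in zip(rb, rp, re)]
                for rb, rp, re in zip(best, prev, reexp)]
        prev = cur
    return best
```

On a constant 3-by-3 picture of gray level 7, zeros outside make the
whole picture act like a bright square on a dark background: the
eight-neighbor version responds at the center, and the four-neighbor
version also responds at the corners. Ignoring outside pixels gives
all 0's in this example.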


3. Examples and concluding remarks

Figure 1 shows eight pictures and their MMMATs computed in
three different ways: $\max_k \Delta_k$ using eight- and four-neighbor
operations, and $\sum_k \Delta_k$ using eight-neighbor operations.
The four-neighbor version contains artifacts due to border effects,
resulting from the fact that the outside of the picture is
treated as consisting of 0's. In all cases, the high MMMAT
values constitute very reasonable "skeletons" of the dark points
in the given picture.

In the two-valued case, S can be reconstructed from its
MAT by a reexpansion process; in fact, S is the union of the
disks centered at the MAT points and with radii equal to the
distance values of those points. In the grayscale case,
analogously, if we know $\Delta_1(P), \Delta_2(P), \ldots, \Delta_m(P)$
for each point P, we can reconstruct I from $I^{(-m)}$ by an
iterated local MAX process, where at each step we add the
appropriate $\Delta$ value back into the picture. Specifically,
given $I^{(-m)}$, we have $I^{(-m+1)} = (I^{(-m)})^{(1)} + \Delta_m$;
$I^{(-m+2)} = (I^{(-m+1)})^{(1)} + \Delta_{m-1}$; ...;
$I = I^{(0)} = (I^{(-1)})^{(1)} + \Delta_1$. Note,
however, that this reconstruction process requires a large
amount of information, namely m arrays of $\Delta$ values, unlike
the two-valued case where we only need a single distance value for
each point. This is a consequence of the fact that in the MAT
construction process, the value of a point changes from 1 to 0
only once (for k equal to its distance from $\bar{S}$), whereas in
MMMAT construction, the value of a point may change at every
iteration.
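The reconstruction recurrence can be checked with a short round-trip
sketch. It assumes a list-of-rows picture, 0's outside the picture, and
that the same neighborhood is used for decomposition and
reconstruction; the function names are hypothetical.

```python
N4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]
N8 = N4 + [(-1, -1), (-1, 1), (1, -1), (1, 1)]


def local_op(img, op, nbrs):
    """op() over each pixel and its neighbors; outside pixels count as 0."""
    rows, cols = len(img), len(img[0])

    def at(r, c):
        return img[r][c] if 0 <= r < rows and 0 <= c < cols else 0

    return [[op([at(r, c)] + [at(r + dr, c + dc) for dr, dc in nbrs])
             for c in range(cols)]
            for r in range(rows)]


def combine(a, b, sign):
    """Pointwise a + sign * b."""
    return [[x + sign * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def decompose(img, m, nbrs=N8):
    """Return (I^(-m), [Delta_1, ..., Delta_m])."""
    deltas, prev = [], img
    for _ in range(m):
        cur = local_op(prev, min, nbrs)                  # I^(-k)
        deltas.append(combine(prev, local_op(cur, max, nbrs), -1))
        prev = cur
    return prev, deltas


def reconstruct(shrunk, deltas, nbrs=N8):
    """Invert decompose: I^(-k+1) = (I^(-k))^(1) + Delta_k, k = m..1."""
    img = shrunk
    for delta in reversed(deltas):
        img = combine(local_op(img, max, nbrs), delta, +1)
    return img
```

The round trip returns the original picture exactly, but only because
all m difference pictures are kept.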


In any case, the picture cannot be reconstructed from its
MMMAT values, since these are maxes or sums of $\Delta_k$'s, and we
need all of the individual $\Delta_k$ values for correct
reconstruction.

It has been suggested [1] that biological visual systems
compute MATs and use them to extract perceptually significant
features from shapes and patterns (e.g., lobes on a shape
correspond to branches on its MAT). However, it seems implausible
that visual systems threshold their input, which would be
necessary for MAT computation. The MMMAT provides a possible
alternative approach in which medial axes can be computed from
unthresholded input.